Targeted estimation of nuisance parameters to obtain valid statistical inference.
نویسنده
چکیده
In order to obtain concrete results, we focus on estimation of the treatment specific mean, controlling for all measured baseline covariates, based on observing independent and identically distributed copies of a random variable consisting of baseline covariates, a subsequently assigned binary treatment, and a final outcome. The statistical model only assumes possible restrictions on the conditional distribution of treatment, given the covariates, the so-called propensity score. Estimators of the treatment specific mean involve estimation of the propensity score and/or estimation of the conditional mean of the outcome, given the treatment and covariates. In order to make these estimators asymptotically unbiased at any data distribution in the statistical model, it is essential to use data-adaptive estimators of these nuisance parameters such as ensemble learning, and specifically super-learning. Because such estimators involve optimal trade-off of bias and variance w.r.t. the infinite dimensional nuisance parameter itself, they result in a sub-optimal bias/variance trade-off for the resulting real-valued estimator of the estimand. We demonstrate that additional targeting of the estimators of these nuisance parameters guarantees that this bias for the estimand is second order and thereby allows us to prove theorems that establish asymptotic linearity of the estimator of the treatment specific mean under regularity conditions. These insights result in novel targeted minimum loss-based estimators (TMLEs) that use ensemble learning with additional targeted bias reduction to construct estimators of the nuisance parameters. In particular, we construct collaborative TMLEs (C-TMLEs) with known influence curve allowing for statistical inference, even though these C-TMLEs involve variable selection for the propensity score based on a criterion that measures how effective the resulting fit of the propensity score is in removing bias for the estimand. As a particular special case, we also demonstrate the required targeting of the propensity score for the inverse probability of treatment weighted estimator using super-learning to fit the propensity score.
منابع مشابه
W. NIEMIRO (Warszawa) ESTIMATION OF NUISANCE PARAMETERS FOR INFERENCE BASED ON LEAST ABSOLUTE DEVIATIONS
Statistical inference procedures based on least absolute deviations involve estimates of a matrix which plays the role of a multivariate nuisance parameter. To estimate this matrix, we use kernel smoothing. We show consistency and obtain bounds on the rate of convergence.
متن کاملLocation Reparameterization and Default Priors for Statistical Analysis
This paper develops default priors for Bayesian analysis that reproduce familiar frequentist and Bayesian analyses for models that are exponential or location. For the vector parameter case there is an information adjustment that avoids the Bayesian marginalization paradoxes and properly targets the prior on the parameter of interest thus adjusting for any complicating nonlinearity the details ...
متن کاملTESTING STATISTICAL HYPOTHESES UNDER FUZZY DATA AND BASED ON A NEW SIGNED DISTANCE
This paper deals with the problem of testing statisticalhypotheses when the available data are fuzzy. In this approach, wefirst obtain a fuzzy test statistic based on fuzzy data, and then,based on a new signed distance between fuzzy numbers, we introducea new decision rule to accept/reject the hypothesis of interest.The proposed approach is investigated for two cases: the casewithout nuisance p...
متن کاملSupplementary Material for “ Statistical Inference for Exploratory Data Analysis and Model Diagnostics ”
An issue in simulation-based statistical inference is that simulation requires a single distribution to sample from, whereas null hypotheses H0 of interest are always composite and hence cannot be simulated directly. The problem of composite null hypotheses is also called the problem of nuisance parameters. Examples of nuisance parameters are the following: In EDA, when the null hypothesis is i...
متن کاملEstimation for the Type-II Extreme Value Distribution Based on Progressive Type-II Censoring
In this paper, we discuss the statistical inference on the unknown parameters and reliability function of type-II extreme value (EVII) distribution when the observed data are progressively type-II censored. By applying EM algorithm, we obtain maximum likelihood estimates (MLEs). We also suggest approximate maximum likelihood estimators (AMLEs), which have explicit expressions. We provide Bayes ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
- The international journal of biostatistics
دوره 10 1 شماره
صفحات -
تاریخ انتشار 2014